Dynamo: a transparent dynamic optimization system

Vasanth Bala, Evelyn Duesterwald, Sanjeev Banerjia 

May 2000 ACM SIGPLAN Notices, Proceedings of the ACM SIGPLAN 2000 conference
on Programming language design and implementation PLDI '00, Volume 35 Issue 5

Publisher: ACM Press 

Full text available: pdf(156.03 KB) Additional Information: full citation, abstract, references, citings, index terms

We describe the design and implementation of Dynamo, a software dynamic optimization 
system that is capable of transparently improving the performance of a native instruction 
stream as it executes on the processor. The input native instruction stream to Dynamo 
can be dynamically generated (by a JIT for example), or it can come from the execution 
of a statically compiled native binary. This paper evaluates the Dynamo system in the 
latter, more challenging situation, in order to emphasize the ... 

The benefits and costs of DyC's run-time optimizations 

Brian Grant, Markus Mock, Matthai Philipose, Craig Chambers, Susan J. Eggers 
September 2000 ACM Transactions on Programming Languages and Systems (TOPLAS), Volume 22 Issue 5

Publisher: ACM Press 

Full text available: pdf(1.59 MB) Additional Information: full citation, abstract, references, citings, index terms



DyC selectively dynamically compiles programs during their execution, utilizing the
run-time-computed values of variables and data structures to apply optimizations that are
based on partial evaluation. The dynamic optimizations are preplanned at static compile 
time in order to reduce their run-time cost; we call this staging. DyC's staged 
optimizations include (1) an advanced binding-time analysis that supports polyvariant 
specialization (enabling both single-way and multi ... 



Keywords: dynamic compilation, specialization 




Optimization and precise exceptions in dynamic compilation 

Michael Gschwind, Erik Altman 

March 2001 ACM SIGARCH Computer Architecture News, Volume 29 Issue 1
Publisher: ACM Press 

Full text available: pdf(508.52 KB) Additional Information: full citation, abstract, index terms




Maintaining precise exceptions is an important aspect of achieving full compatibility with a 
legacy architecture. While asynchronous exceptions can be deferred to an appropriate 
boundary in the code, synchronous exceptions must be taken when they occur. This 
introduces uncertainty into liveness analysis since processor state that is otherwise dead
may be exposed when an exception handler is invoked. Previous systems either had to
sacrifice full compatibility to achieve more freedom to perform op ... 



Software profiling for hot path prediction: less is more 

Evelyn Duesterwald, Vasanth Bala 

November 2000 ACM SIGOPS Operating Systems Review, ACM SIGARCH Computer
Architecture News, Proceedings of the ninth international conference
on Architectural support for programming languages and operating
systems ASPLOS-IX, Volume 34, 28 Issue 5, 5

Publisher: ACM Press 

Full text available: pdf(286.07 KB) Additional Information: full citation, abstract, references, citings, index terms

Recently, there has been a growing interest in exploiting profile information in adaptive 
systems such as just-in-time compilers, dynamic optimizers, and binary translators. In
this paper, we show that sophisticated software profiling schemes that provide highly 
accurate information in an offline setting are ill-suited for these dynamic code generation 
systems. We experimentally demonstrate that hot path predictions must be made early in 
order to control the rising cost of missed opportunity tha ... 



Software profiling for hot path prediction: less is more 

Evelyn Duesterwald, Vasanth Bala 

November 2000 ACM SIGPLAN Notices, Volume 35 Issue 11
Publisher: ACM Press 

Full text available: pdf(2.43 MB) Additional Information: full citation, abstract, references, index terms

Recently, there has been a growing interest in exploiting profile information in adaptive 
systems such as just-in-time compilers, dynamic optimizers, and binary translators. In
this paper, we show that sophisticated software profiling schemes that provide highly 
accurate information in an offline setting are ill-suited for these dynamic code generation 
systems. We experimentally demonstrate that hot path predictions must be made early in 
order to control the rising cost of missed opportunity tha ... 



Machine-adaptable dynamic binary translation 

David Ung, Cristina Cifuentes 

January 2000 ACM SIGPLAN Notices, Proceedings of the ACM SIGPLAN workshop on
Dynamic and adaptive compilation and optimization DYNAMO '00, Volume 35 Issue 7

Publisher: ACM Press 

Full text available: pdf(1.23 MB) Additional Information: full citation, abstract, references, citings, index terms

Dynamic binary translation is the process of translating and optimizing executable code 
for one machine to another at runtime, while the program is "executing" on the target 
machine. 



Dynamic translation techniques have normally been limited to two particular machines; a 
competitor's machine and the hardware manufacturer's machine. This research provides 
for a more general framework for dynamic translations, by providing a framework based 
on specifications of machines that ... 

Keywords: binary translation, dynamic compilation, dynamic execution, emulation, 
interpretation 



A hardware mechanism for dynamic extraction and relayout of program hot spots 

Matthew C. Merten, Andrew R. Trick, Erik M. Nystrom, Ronald D. Barnes, Wen-mei W. Hwu

May 2000 ACM SIGARCH Computer Architecture News, Proceedings of the 27th
annual international symposium on Computer architecture ISCA '00, Volume 28 Issue 2

Publisher: ACM Press 

Full text available: pdf(320.13 KB) Additional Information: full citation, abstract, references, citings, index terms



This paper presents a new mechanism for collecting and deploying runtime optimized 
code. The code-collecting component resides in the instruction retirement stage and lays 
out hot execution paths to improve instruction fetch rate as well as enable further code 
optimization. The code deployment component uses an extension to the Branch Target 
Buffer to migrate execution into the new code without modifying the original code. No 
significant delay is added to the total execution of the program ... 

Improving Java performance using hardware translation 

Ramesh Radhakrishnan, Ravi Bhargava, Lizy K. John 

June 2001 Proceedings of the 15th international conference on Supercomputing 

Publisher: ACM Press 

Full text available: pdf(254.91 KB) Additional Information: full citation, abstract, references, citings, index terms

State of the art Java Virtual Machines with Just-In-Time (JIT) compilers make use of 
advanced compiler techniques, run-time profiling and adaptive compilation to improve 
performance. However, these techniques for alleviating performance bottlenecks are more 
effective in long running workloads, such as server applications. Short running Java 
programs, or client workloads, spend a large fraction of their execution time in 
compilation instead of useful execution when run using JIT compilers. In ... 

Novel ideas: Performance characterization of a hardware mechanism for dynamic 
optimization 

Brian Fahs, Satarupa Bose, Matthew Crum, Brian Slechta, Francesco Spadini, Tony Tung, 
Sanjay J. Patel, Steven S. Lumetta 

December 2001 Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture

Publisher: IEEE Computer Society 

Full text available: pdf(1.31 MB), Publisher Site
Additional Information: full citation, abstract, references, citings

We evaluate the rePLay microarchitecture as a means for reducing application execution 
time by facilitating dynamic optimization. The framework contains a programmable 
optimization engine coupled with a hardware-based recovery mechanism. The 
optimization engine enables the dynamic optimizer to run concurrently with program 
execution. The recovery mechanism enables the optimizer to make speculative 
optimizations without requiring recovery code. We demonstrate that a rePLay configuration 
performing

Increasing the size of atomic instruction blocks using control flow assertions

Sanjay J. Patel, Tony Tung, Satarupa Bose, Matthew M. Crum 

December 2000 Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture

Publisher: ACM Press 

Full text available: pdf(140.81 KB), ps(646.25 KB), Publisher Site
Additional Information: full citation, references, citings, index terms




Overcoming the challenges to feedback-directed optimization (Keynote Talk) 

Michael D. Smith 

January 2000 ACM SIGPLAN Notices, Proceedings of the ACM SIGPLAN workshop on
Dynamic and adaptive compilation and optimization DYNAMO '00, Volume 35 Issue 7

Publisher: ACM Press 

Full text available: pdf(1.33 MB) Additional Information: full citation, abstract, references, citings, index terms

Feedback-directed optimization (FDO) is a general term used to describe any technique 
that alters a program's execution based on tendencies observed in its present or past 
runs. This paper reviews the current state of affairs in FDO and discusses the challenges 
inhibiting further acceptance of these techniques. It also argues that current trends in
hardware and software technology have resulted in an execution environment where
immutable executables and traditional static optimizations are ... 



Profile-guided optimization across process boundaries 

Erik Johansson, Sven-Olof Nystrom 

January 2000 ACM SIGPLAN Notices, Proceedings of the ACM SIGPLAN workshop on
Dynamic and adaptive compilation and optimization DYNAMO '00, Volume 35 Issue 7

Publisher: ACM Press 

Full text available: pdf(911.89 KB) Additional Information: full citation, abstract, references, citings, index terms

We describe a profile-driven compiler optimization technique for inter-process 
optimization, which dynamically inlines the effects of sending messages. Profiling is used 
to find optimization opportunities, and to dynamically trigger recompilation and 
optimization at run-time. We apply the optimization technique on the concurrent 
programming language ERLANG, letting recompilation take place in a separate ERLANG 
process, and taking advantage of the facilities provided by ERLANG to dynami

Binary translation and architecture convergence issues for IBM system/390

Michael Gschwind, Kemal Ebcioglu, Erik Altman, Sumedh Sathaye 

May 2000 Proceedings of the 14th international conference on Supercomputing 

Publisher: ACM Press 

Full text available: pdf(1.44 MB) Additional Information: full citation, abstract, references, index terms

We describe the design issues in an implementation of the ESA/390 architecture based on 
binary translation to a very long instruction word (VLIW) processor. During binary 
translation, complex ESA/390 instructions are decomposed into instruction "primitives" 
which are then scheduled onto a wide-issue machine. The aim is to achieve high 
instruction level parallelism due to the increased scheduling and optimization 
opportunities which can be exploited by binary translation software ... 



An evaluation of staged run-time optimizations in DyC

Brian Grant, Matthai Philipose, Markus Mock, Craig Chambers, Susan J. Eggers 
May 1999 ACM SIGPLAN Notices, Proceedings of the ACM SIGPLAN 1999 conference
on Programming language design and implementation PLDI '99, Volume 34 Issue 5

Publisher: ACM Press 

Full text available: pdf(1.54 MB) Additional Information: full citation, abstract, references, citings, index terms

Previous selective dynamic compilation systems have demonstrated that dynamic 
compilation can achieve performance improvements at low cost on small kernels, but they 
have had difficulty scaling to larger programs. To overcome this limitation, we developed 
DyC, a selective dynamic compilation system that includes more sophisticated and flexible 
analyses and transformations. DyC is able to achieve good performance improvements on 
programs that are much larger and more complex than the kernels. We ... 



Techniques for obtaining high performance in Java programs 

Iffat H. Kazi, Howard H. Chen, Berdenia Stanley, David J. Lilja 
September 2000 ACM Computing Surveys (CSUR), Volume 32 Issue 3

Publisher: ACM Press 

Full text available: pdf(816.13 KB) Additional Information: full citation, abstract, references, citings, index terms

This survey describes research directions in techniques to improve the performance of 
programs written in the Java programming language. The standard technique for Java 
execution is interpretation, which provides for extensive portability of programs. A Java 
interpreter dynamically executes Java bytecodes, which comprise the instruction set of 
the Java Virtual Machine (JVM). Execution time performance of Java programs can be 
improved through compilation, possibly at the expense of portabili ... 



Keywords: Java, Java virtual machine, bytecode-to-source translators, direct compilers, 
dynamic compilation, interpreters, just-in-time compilers 



Adaptive optimization in the Jalapeño JVM

Matthew Arnold, Stephen Fink, David Grove, Michael Hind, Peter F. Sweeney

October 2000 ACM SIGPLAN Notices, Proceedings of the 15th ACM SIGPLAN
conference on Object-oriented programming, systems, languages, and
applications OOPSLA '00, Volume 35 Issue 10

Publisher: ACM Press 

Full text available: pdf(716.90 KB) Additional Information: full citation, abstract, references, citings, index terms

Future high-performance virtual machines will improve performance through sophisticated 
online feedback-directed optimizations. This paper presents the architecture of the
Jalapeño Adaptive Optimization System, a system to support leading-edge virtual 
machine technology and enable ongoing research on online feedback-directed 
optimizations. We describe the extensible system architecture, based on a federation of 
threads with asynchronous communication. We present an implementation of t ...

Partial evaluation as a means for inferencing data structures in an applicative
language: a theory and implementation in the case of prolog

H. Jan Komorowski

January 1982 Proceedings of the 9th ACM SIGPLAN-SIGACT symposium on Principles
of programming languages

Publisher: ACM Press 

Full text available: pdf(1.24 MB) Additional Information: full citation, abstract, references, citings

An operational semantics of the Prolog programming language is introduced. Meta-IV is 
used to specify the semantics. One purpose of the work is to provide a specification of 
an implementation of a Prolog interpreter. Another one is an application of this 
specification to a formal description of program optimization techniques based on the 
principle of partial evaluation. Transformations which account for pruning, forward data 
structure propagation and opening (which al ... 

On the development of a site selection optimizer for distributed and parallel database
systems

Fotis Barlos, Ophir Frieder

December 1993 Proceedings of the second international conference on Information
and knowledge management

Publisher: ACM Press 

Full text available: pdf(1.11 MB) Additional Information: full citation, references, index terms

DSL implementation using staging and monads

Tim Sheard, Zine-el-abidine Benaissa, Emir Pasalic 

December 1999 ACM SIGPLAN Notices, Proceedings of the 2nd conference on
Domain-specific languages PLAN '99, Volume 35 Issue 1
Publisher: ACM Press 

Full text available: pdf(923.07 KB) Additional Information: full citation, abstract, references, citings, index terms

The impact of Domain Specific Languages (DSLs) on software design is considerable. They
allow programs to be more concise than equivalent programs written in a high-level
programming language. They relieve programmers from making decisions about data
structure and algorithm design, and thus allow solutions to be constructed quickly.
Because DSLs are at a higher level of abstraction they are easier to maintain and reason
about than equivalent programs written in a high-level language, and

Optimising hot paths in a dynamic binary translator

David Ung, Cristina Cifuentes

March 2001 ACM SIGARCH Computer Architecture News, Volume 29 Issue 1
Publisher: ACM Press 



Full text available: pdf(890.10 KB) Additional Information: full citation, abstract, citings, index terms



In dynamic binary translation, code is translated "on the fly" at run-time, while the user 
perceives ordinary execution of the program on the target machine. Code fragments that 
are frequently executed follow the same sequence of flow control over a period of time. 
These fragments form a hot path and are optimised to improve the overall performance of 
the program. Multiple hot paths may also exist in programs. A program may choose to 
execute in one hot path for some time, but later switch to anot ... 

Keywords: binary translation, dynamic compilation, dynamic execution, run-time 
profiling 